DETAILED ACTION

Status of Claims 

1. Claims 1-10 are pending. 

Claim Rejections - 35 USC § 102 

2. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

3. Claims 1-10 are rejected under 35 U.S.C. 102(b) as being anticipated by Choi 
(USPN 2002/0042793, referred to as Choi). 

Claims 1, 3, 6, 8 and 10: 

Choi teaches a method for determining the semantic similarity of words in a 
plurality of words selected from a set of one or more documents, for use in the retrieval 
of information in an information system, comprising the steps of: 

(i) for each word of said plurality of words: 

(a) identifying, in documents of said set of one or more documents (Choi, 
¶ 0002: relevant documents), word sequences comprising the word and a 
predetermined number of other words (Choi, ¶ 0002: query words); 

(b) calculating a relative frequency of occurrence for each distinct word 
sequence among word sequences containing the word (Choi, ¶ 0027: search and 
frequencies of the keywords); and 

(c) generating a fuzzy set comprising, for word sequences containing the 
word (Choi, ¶ 0143: Clustering method includes k-nearest neighbor method), 
corresponding fuzzy membership values calculated from the relative frequencies 
determined at step (b) (Choi, ¶ 0143: fuzzy method); and 

(ii) calculating and storing (Choi, Fig 4: Mutual Information Volume DB), for each 
pair of words of said plurality of words, using respective fuzzy sets generated at step 
(i), a probability that the first word of the pair is semantically suitable (Choi, ¶ 0002: 
degree of semantic similarity) as a replacement for the second word of the pair (Choi, 
¶ 0143: statistical similarity). 
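
For illustration only, steps (i)(a) to (i)(c) and (ii) as recited above can be sketched in Python. This is a minimal sketch under stated assumptions, not the applicant's or Choi's implementation: the claim language does not fix how membership values are derived from relative frequencies or how the pairwise probability is computed. Here the target word is replaced by a placeholder so that sequences of different words are comparable, membership is the relative frequency divided by its maximum, and the probability is the expected membership of the second word's sequences in the first word's fuzzy set. All function names are hypothetical.

```python
from collections import Counter, defaultdict

SLOT = "_"  # placeholder marking where the target word sat (assumption)


def context_counts(documents, window=1):
    """Step (i)(a): count sequences of a word plus `window` neighbours per side."""
    counts = defaultdict(Counter)
    for doc in documents:
        tokens = doc.lower().split()
        for i in range(window, len(tokens) - window):
            left = tuple(tokens[i - window:i])
            right = tuple(tokens[i + 1:i + 1 + window])
            counts[tokens[i]][left + (SLOT,) + right] += 1
    return counts


def relative_frequencies(counter):
    """Step (i)(b): relative frequency of each distinct sequence for one word."""
    total = sum(counter.values())
    return {ctx: n / total for ctx, n in counter.items()}


def fuzzy_set(freqs):
    """Step (i)(c): membership values, assumed here to be max-normalised."""
    if not freqs:
        return {}
    peak = max(freqs.values())
    return {ctx: f / peak for ctx, f in freqs.items()}


def replacement_probability(first, second, counts):
    """Step (ii): assumed form, expected membership of `second`'s sequences
    in the fuzzy set of `first`."""
    mu_first = fuzzy_set(relative_frequencies(counts[first]))
    p_second = relative_frequencies(counts[second])
    return sum(p * mu_first.get(ctx, 0.0) for ctx, p in p_second.items())


docs = ["the quick fox runs", "the fast fox runs", "the quick dog sleeps"]
counts = context_counts(docs)
print(replacement_probability("fast", "quick", counts))  # 0.5
print(replacement_probability("quick", "fast", counts))  # 1.0
```

The measure is deliberately asymmetric: "quick" covers every sequence seen with "fast", while "fast" covers only half of the sequences seen with "quick".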

Choi teaches the additional limitations of Claims 3, 6 and 8: 

an input for receiving a search query (Choi, ¶ 0004: query words given by a 
user); 

generating means for generating a set of probabilities indicative of the semantic 
similarity of words selected from said set of one or more documents (Choi, ¶ 0002: 
semantic similarity); 

query enhancement means for modifying a received search query with reference, 
in use, to said generated set of probabilities (Choi, ¶ 0009: probability model); and 

information retrieval means (Choi, ¶ 0008: information retrieval system) for 
searching said set of one or more documents for relevant information using a received 
search query modified by said query enhancement means (Choi, ¶ 0089: enhanced 
efficiency of search). 

Choi also teaches the added limitation of Claim 8: generating in the form of a 
matrix (Choi, ¶ 0106: cluster variables (entropy) results in a matrix of N×P). 

EN: Although claims 1, 3, 6, 8 and 10 have certain syntactic differences, they are 
substantially similar in content, and hence the same rejections apply. They have been 
grouped for brevity. 

Claims 2, 5, 7: 

Choi teaches a method according to claim 1, further comprising the step of: 
(iii) adding a new document to said set of one or more documents (Choi, ¶ 0171: 
new data ... is input) and, using a set of words selected from said new document, 
performing an incremental update (Choi, ¶ 0166: update connection strength) to said 
stored probabilities by means of steps (i) and (ii) performed in respect of said selected 
words using word sequences identified in said new document (Choi, ¶ 0171: produce a 
completely new class; EN: a 'new class' implies performing steps (i) and (ii) with 
respect to new word sequences). 
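
For illustration only, the incremental update of step (iii) can be sketched as a store that keeps raw sequence counts and, when a new document arrives, recomputes only the pairs that involve a word seen in that document. This is a minimal sketch under assumptions: the class `SimilarityStore`, its methods, and the scoring formula (expected max-normalised membership) are hypothetical and are not taken from the claims or from Choi.

```python
from collections import Counter, defaultdict


class SimilarityStore:
    """Keeps per-word sequence counts and a cache of pairwise probabilities."""

    def __init__(self, window=1):
        self.window = window
        self.counts = defaultdict(Counter)  # word -> Counter of sequences
        self.probs = {}                     # (first, second) -> probability

    def _score(self, first, second):
        # Assumed stand-in for steps (i)(b), (i)(c) and (ii): membership of a
        # sequence in `first` is its count divided by the largest count.
        c_first, c_second = self.counts[first], self.counts[second]
        peak, total = max(c_first.values()), sum(c_second.values())
        return sum(n / total * c_first.get(ctx, 0) / peak
                   for ctx, n in c_second.items())

    def add_document(self, text):
        """Step (iii): fold in one new document, refresh only affected pairs."""
        tokens = text.lower().split()
        w = self.window
        touched = set()
        for i in range(w, len(tokens) - w):
            ctx = tuple(tokens[i - w:i]) + ("_",) + tuple(tokens[i + 1:i + 1 + w])
            self.counts[tokens[i]][ctx] += 1
            touched.add(tokens[i])
        for a in touched:
            for b in self.counts:
                self.probs[(a, b)] = self._score(a, b)
                self.probs[(b, a)] = self._score(b, a)
        return touched


store = SimilarityStore()
store.add_document("the quick fox runs")
store.add_document("the quick dog sleeps")
touched = store.add_document("the fast fox runs")
print(touched, store.probs[("fast", "quick")])
```

Keeping counts rather than finished probabilities is the design choice that makes the update incremental: the earlier documents never need to be rescanned.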

Claim 4: 

Choi teaches an information retrieval apparatus according to claim 3, wherein 
said query enhancement means are arranged to identify, with reference to said 
generated set of probabilities, a word having a similar meaning to a term of said 
received search query and to modify said search query using said identified word 
(Choi, ¶ 0087: probability distribution of data). 
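
For illustration only, the query enhancement recited in claim 4 can be sketched as query expansion driven by the stored probabilities. This is a minimal sketch under assumptions: the function `expand_query`, the `threshold` and `max_extra` parameters, and the sample probabilities are hypothetical and do not come from the claims or from Choi.

```python
def expand_query(query, probs, threshold=0.5, max_extra=2):
    """Append, for each query term, its best-scoring replacement candidates.

    `probs[(candidate, term)]` is the stored probability that `candidate`
    is semantically suitable as a replacement for `term`.
    """
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        candidates = sorted(
            ((p, cand) for (cand, target), p in probs.items()
             if target == term and cand != term and p >= threshold),
            reverse=True,
        )
        for _, cand in candidates[:max_extra]:
            if cand not in expanded:
                expanded.append(cand)
    return expanded


probs = {("fast", "quick"): 0.8, ("rapid", "quick"): 0.6,
         ("slow", "quick"): 0.1, ("car", "vehicle"): 0.9}
print(expand_query("quick vehicle", probs))
# ['quick', 'vehicle', 'fast', 'rapid', 'car']
```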

Claim 8: 

8. An information processing apparatus for use in an information processing 
apparatus, for use in an information system, for identifying information sets 
associated with a predetermined information category, the apparatus comprising: 
generating means for generating, in the form of a matrix, a set of 
probabilities indicative of the semantic similarity of words selected from a 
sample set of one or more documents representative of the predetermined 
information category; calculating means arranged to calculate, for each 
information set, a vector of values representing the relative frequency of 
occurrence, in the information set, of words represented in a matrix generated 
by the generating means; and clustering means arranged to determine a measure 
of mutual similarity between pairs of information sets, using the respectively 
calculated vectors and the generated matrix, and to use the determined measures 
in a clustering algorithm to select one or more information sets to associate 
with the predetermined information category, wherein said generating means are 
arranged, in use: (i) for each word selected from said sample set: (a) to 
identify, in documents of said sample set, word sequences comprising the word 
and a predetermined number of other words; (b) to calculate a relative 
frequency of occurrence for each distinct word sequence among word sequences 
containing the word; and (c) to generate a fuzzy set comprising, for groups of 
word sequences containing the word, corresponding fuzzy membership values 
calculated from the relative frequencies determined at step (b); and (ii) to 
calculate, for each pair of words of said plurality of words, using respective 
fuzzy sets generated at step (i), a probability that the first word of the pair 
is semantically suitable as a replacement for the second word of the pair. 
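
For illustration only, the calculating means and the measure of mutual similarity recited in claim 8 can be sketched as follows. This is a minimal sketch under assumptions: the claim requires only that the measure use the frequency vectors and the generated matrix, so the bilinear form below (sum over i and j of u[i] * matrix[i][j] * v[j]) is one possible choice, and the function names and sample matrix are hypothetical.

```python
def frequency_vector(text, vocabulary):
    """Relative frequency, within one information set, of each matrix word."""
    tokens = [t for t in text.lower().split() if t in vocabulary]
    if not tokens:
        return [0.0] * len(vocabulary)
    return [tokens.count(word) / len(tokens) for word in vocabulary]


def mutual_similarity(u, v, matrix):
    """Assumed bilinear measure: sum_i sum_j u[i] * matrix[i][j] * v[j]."""
    return sum(u[i] * matrix[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


vocabulary = ["quick", "fast", "dog"]
# matrix[i][j]: probability that word i is a suitable replacement for word j
matrix = [[1.0, 0.5, 0.0],
          [0.5, 1.0, 0.0],
          [0.0, 0.0, 1.0]]
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

a = frequency_vector("quick dog", vocabulary)
b = frequency_vector("fast dog", vocabulary)
print(mutual_similarity(a, b, matrix))    # 0.375
print(mutual_similarity(a, b, identity))  # 0.25
```

With the identity matrix only the shared word "dog" contributes; the semantic matrix raises the score because "quick" and "fast" are treated as partially interchangeable.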

Claim 9: 

Choi teaches an information processing apparatus according to claim 8, wherein 
the clustering algorithm is a hierarchic agglomerative clustering algorithm (Choi, 
¶ 0035: hierarchical clustering for a statistical similarity). 
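
For illustration only, a hierarchic agglomerative clustering algorithm of the kind recited in claim 9 can be sketched as follows. This is a minimal sketch under assumptions: average linkage and a stopping `threshold` are choices made here, since neither the claim nor the cited passage specifies a linkage criterion, and the sample similarity matrix is hypothetical.

```python
def agglomerative_clusters(similarity, threshold):
    """Average-linkage agglomerative clustering over a symmetric similarity matrix.

    Starts with one cluster per item and repeatedly merges the most similar
    pair of clusters until no pair reaches `threshold`.
    """
    clusters = [[i] for i in range(len(similarity))]

    def linkage(a, b):
        # Mean pairwise similarity between the members of two clusters.
        return sum(similarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best, (x, y) = max(
            (linkage(clusters[x], clusters[y]), (x, y))
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


similarity = [[1.0, 0.9, 0.1, 0.2],
              [0.9, 1.0, 0.2, 0.1],
              [0.1, 0.2, 1.0, 0.8],
              [0.2, 0.1, 0.8, 1.0]]
print(agglomerative_clusters(similarity, threshold=0.5))  # [[0, 1], [2, 3]]
```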



Examination Considerations 

1. Examiner's Notes (EN) are provided with the cited references to prior art to assist 
the applicant to better understand the nature of the prior art, application of such prior art 
and, as appropriate, to further indicate other prior art that may be applied in other office 
actions. Such comments are entirely consistent with the intent and spirit of compact 
prosecution. However, and unless otherwise stated, the Examiner's Notes are not prior 
art but a link to prior art that one of ordinary skill in the art would find inherently 
appropriate. 

2. Examiner has cited particular columns and line numbers (or paragraphs) in the 
references applied to the claims above for the convenience of the applicant. Although 
the specified citations are representative of the teachings of the art and are applied to 
specific limitations within the individual claim, other passages and figures may apply as 
well. The Applicant is respectfully requested, in preparing responses, to fully 
consider the references in their entirety as potentially teaching all or part of the claimed 
invention, as well as the context of the passage as taught by the prior art or disclosed 
by the Examiner. The entire reference is considered to provide disclosure relating to 
the claimed invention. 

Conclusion 

3. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

a. Little, USPN 2002/0059220, cited for a search engine with matching based 
on fuzzy logic. 



4. Claims 1-10 are rejected. 

Status information for published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

KB 

Nov 06, 2007 







